Last updated: 2021-01-29


Knit directory: factor_analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20200623) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.

absolute relative
/project2/xinhe/xsun/website/factor_analysis/output/sum_pliercanon_wgpc_ldcl_d1k_ageco.rdata output/sum_pliercanon_wgpc_ldcl_d1k_ageco.rdata
/project2/xinhe/xsun/website/factor_analysis/output/sum_sep_tog_pval_pliercanon_wgpc_ldcl_d1k_ageco.rdata output/sum_sep_tog_pval_pliercanon_wgpc_ldcl_d1k_ageco.rdata
/project2/xinhe/xsun/website/factor_analysis/output/info_pval5e8_pliercanon_ld_d1k_ageco.rdata output/info_pval5e8_pliercanon_ld_d1k_ageco.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv27top025_pliercanon.rdata output/lv27top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv76top025_pliercanon.rdata output/lv76top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv90top025_pliercanon.rdata output/lv90top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv49top025_pliercanon.rdata output/lv49top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv82top025_pliercanon.rdata output/lv82top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv36top025_pliercanon.rdata output/lv36top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv39top025_pliercanon.rdata output/lv39top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv6top025_pliercanon.rdata output/lv6top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv119top025_pliercanon.rdata output/lv119top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv23top025_pliercanon.rdata output/lv23top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv26top025_pliercanon.rdata output/lv26top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv125top025_pliercanon.rdata output/lv125top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv47top025_pliercanon.rdata output/lv47top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/bmi_27_53759010.coloc_pliercanon_d1k_500.rdata output/bmi_27_53759010.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/bmi_27_85758440.coloc_pliercanon_d1k_500.rdata output/bmi_27_85758440.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/bmi_76_40380914.coloc_pliercanon_d1k_500.rdata output/bmi_76_40380914.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/bmi_76_43842728.coloc_pliercanon_d1k_500.rdata output/bmi_76_43842728.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/LDL_125_160687504.coloc_pliercanon_d1k_500.rdata output/LDL_125_160687504.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/LDL_125_44943964.coloc_pliercanon_d1k_500.rdata output/LDL_125_44943964.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/LDL_125_56955678.coloc_pliercanon_d1k_500.rdata output/LDL_125_56955678.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/LDL_125_56965346.coloc_pliercanon_d1k_500.rdata output/LDL_125_56965346.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lymph_26_41686157.coloc_pliercanon_d1k_500.rdata output/lymph_26_41686157.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lymph_26_59258463.coloc_pliercanon_d1k_500.rdata output/lymph_26_59258463.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/plt_49_34414160.coloc_pliercanon_d1k_500.rdata output/plt_49_34414160.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/plt_49_56786080.coloc_pliercanon_d1k_500.rdata output/plt_49_56786080.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/plt_49_88978375.coloc_pliercanon_d1k_500.rdata output/plt_49_88978375.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/rbc_82_32506416.coloc_pliercanon_d1k_500.rdata output/rbc_82_32506416.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/rbc_82_42718545.coloc_pliercanon_d1k_500.rdata output/rbc_82_42718545.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/rbc_82_54482753.coloc_pliercanon_d1k_500.rdata output/rbc_82_54482753.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/rbc_82_8614076.coloc_pliercanon_d1k_500.rdata output/rbc_82_8614076.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/wbc_6_129655816.coloc_pliercanon_d1k_500.rdata output/wbc_6_129655816.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/wbc_6_81575948.coloc_pliercanon_d1k_500.rdata output/wbc_6_81575948.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/wbc_119_16589041.coloc_pliercanon_d1k_500.rdata output/wbc_119_16589041.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/wbc_119_31355703.coloc_pliercanon_d1k_500.rdata output/wbc_119_31355703.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/WHR_47_53955989.coloc_pliercanon_d1k_500.rdata output/WHR_47_53955989.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/WHR_47_6742916.coloc_pliercanon_d1k_500.rdata output/WHR_47_6742916.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/sum_sep_tog_pval_pliercanon_wgpc_ldcl_d1k.rdata output/sum_sep_tog_pval_pliercanon_wgpc_ldcl_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/info_pval5e8_pliercanon_ld_d1k.rdata output/info_pval5e8_pliercanon_ld_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/sum_sep_tog_pval_pliercanon_wgpc_ldcl_d1k_noageco.rdata output/sum_sep_tog_pval_pliercanon_wgpc_ldcl_d1k_noageco.rdata
/project2/xinhe/xsun/website/factor_analysis/output/info_pval5e8_pliercanon_ld_d1k_noageco.rdata output/info_pval5e8_pliercanon_ld_d1k_noageco.rdata

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 8f0bf57. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .RData
    Ignored:    .Rhistory
    Ignored:    analysis/.Rhistory

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd) and HTML (docs/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd 8f0bf57 XSun 2021-01-29 update
html 47bf801 XSun 2020-12-03 Build site.
Rmd 276bb01 XSun 2020-12-03 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html e91abdf XSun 2020-12-03 Build site.
Rmd d268881 XSun 2020-12-03 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 1237bb7 XSun 2020-12-03 Build site.
Rmd 2a146c4 XSun 2020-12-03 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
Rmd 8c5f609 XSun 2020-12-01 update
html b2114ce XSun 2020-12-01 Build site.
Rmd 5615ac5 XSun 2020-12-01 update
html d33bc44 XSun 2020-11-30 Build site.
Rmd c7567a3 XSun 2020-11-30 wflow_publish(c(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000_adipose_sub.Rmd”,
html c02b975 XSun 2020-11-30 Build site.
html d75fbad XSun 2020-11-27 Build site.
Rmd 7b3c083 XSun 2020-11-27 update
html aba60f8 XSun 2020-11-26 Build site.
Rmd 0accf45 XSun 2020-11-26 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 10d1a91 XSun 2020-11-26 Build site.
Rmd 49ce40a XSun 2020-11-26 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html ec70427 XSun 2020-11-26 Build site.
Rmd 0cbeceb XSun 2020-11-26 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 00cd6e2 XSun 2020-11-26 Build site.
Rmd f78dde7 XSun 2020-11-26 update
Rmd d77eaf4 XSun 2020-11-23 update
html 434ac8c XSun 2020-11-23 Build site.
html 003483f XSun 2020-11-23 Build site.
Rmd 500542d XSun 2020-11-23 update
html 49c92ad XSun 2020-11-23 Build site.
Rmd 691daa9 XSun 2020-11-23 update
html c729bb2 XSun 2020-11-21 Build site.
html fc29732 XSun 2020-11-21 Build site.
Rmd 50faac0 XSun 2020-11-21 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 5ffe456 XSun 2020-11-20 Build site.
Rmd 1349673 XSun 2020-11-20 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 182a46e XSun 2020-11-20 Build site.
Rmd 1f761b4 XSun 2020-11-20 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
Rmd e947318 XSun 2020-11-19 update
html 36b155f XSun 2020-11-18 Build site.
Rmd 8dcca15 XSun 2020-11-18 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html dead226 XSun 2020-11-18 Build site.
Rmd 818deca XSun 2020-11-18 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
Rmd 0a5f8b5 XSun 2020-11-18 update
html da71804 XSun 2020-11-02 Build site.
Rmd 02226ad XSun 2020-11-02 update
Rmd 377743f XSun 2020-11-02 update
html fb88efd XSun 2020-10-29 Build site.
Rmd bcb9478 XSun 2020-10-29 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 1ea5535 XSun 2020-10-29 Build site.
Rmd b26e318 XSun 2020-10-29 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 23966ec XSun 2020-10-28 Build site.
Rmd 5346342 XSun 2020-10-28 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 9f1cff7 XSun 2020-10-28 Build site.
Rmd da59019 XSun 2020-10-28 update
html 77a2489 XSun 2020-10-28 Build site.
Rmd cd10457 XSun 2020-10-28 update
Rmd f265455 XSun 2020-10-28 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html a14cfaa XSun 2020-10-28 Build site.
Rmd 58fbdd2 XSun 2020-10-28 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 8e64234 XSun 2020-10-24 Build site.
Rmd ef67b09 XSun 2020-10-24 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 4793c9c XSun 2020-10-24 Build site.
Rmd 05cf21b XSun 2020-10-24 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 467f893 XSun 2020-10-24 Build site.
Rmd 021b0b4 XSun 2020-10-24 update
html 6b65c0c XSun 2020-10-22 Build site.
Rmd d13e337 XSun 2020-10-22 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 8ba3ee9 XSun 2020-10-22 Build site.
Rmd aa16fcb XSun 2020-10-22 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html baa1f02 XSun 2020-10-22 Build site.
Rmd 2dd03ae XSun 2020-10-22 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 36ea64d XSun 2020-10-22 Build site.
Rmd 19694ce XSun 2020-10-22 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
Rmd 6f6c907 XSun 2020-10-22 update
html 1cc9328 XSun 2020-10-22 Build site.
Rmd b94cb1b XSun 2020-10-22 update
html 00f41ae XSun 2020-10-21 Build site.
Rmd 312defa XSun 2020-10-21 update
Rmd 9505772 XSun 2020-10-19 update
html 3567877 XSun 2020-10-19 Build site.
Rmd db41780 XSun 2020-10-19 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
Rmd b197942 XSun 2020-10-19 update
html 3250458 XSun 2020-10-19 Build site.
Rmd d04088b XSun 2020-10-19 update
html 3d0c708 XSun 2020-10-19 Build site.
Rmd 2315fbc XSun 2020-10-19 update
html acfd22d XSun 2020-10-18 Build site.
Rmd 6218b87 XSun 2020-10-18 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 1327f34 XSun 2020-10-18 Build site.
Rmd bc3dc2d XSun 2020-10-18 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html f033bab XSun 2020-10-18 Build site.
Rmd 9535d29 XSun 2020-10-18 update
html d11838b XSun 2020-10-16 Build site.
Rmd bddde1b XSun 2020-10-16 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 39c0f34 XSun 2020-10-16 Build site.
Rmd 73f6e34 XSun 2020-10-16 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html e1e20c9 XSun 2020-10-16 Build site.
Rmd c10d340 XSun 2020-10-16 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 83c7381 XSun 2020-10-16 Build site.
Rmd 8407ce3 XSun 2020-10-16 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 5769bae XSun 2020-10-16 Build site.
Rmd a13fb68 XSun 2020-10-16 update
html 6c0de70 XSun 2020-10-16 Build site.
html 00d36e3 XSun 2020-10-16 Build site.
Rmd a615ad1 XSun 2020-10-16 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 28aea11 XSun 2020-10-15 Build site.
Rmd fa849c5 XSun 2020-10-15 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 0046ff2 XSun 2020-10-14 Build site.
Rmd 14e53c8 XSun 2020-10-14 update

Introduction

In this part, I considered the traits separately. For each trait, I selected the SNPs with pval < 5e-8 and then applied LD clumping to remove SNPs in LD with each other and obtain a smaller subset of SNPs. After that, I ran association tests between the plier_canonical factors and these SNPs. I also did colocalization analysis for some significant factor~SNP pairs.

Material and Methods

  1. Catalog GWAS data:
  • platelet count (plt), white blood cell count (wbc), myeloid white cell count (myeloid wbc), lymphocyte count (lymph), red blood cell count (rbc), granulocyte count (gran), eosinophil count (eo), neutrophil count (neut) from Astle WJ, Elding H, Jiang T, et al. The Allelic Landscape of Human Blood Cell Trait Variation and Links to Common Complex Disease. Cell. 2016;167(5):1415-1429.e19. doi:10.1016/j.cell.2016.10.042.

  • T2D. I first used data from our lab collection: Morris et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat Genet. 2012 Sep;44(9):981-90. doi: 10.1038/ng.2383. Epub 2012 Aug 12. PMID: 22885922; PMCID: PMC3442244. However, this dataset doesn’t contain MAF information for the variants, so I added data from the GWAS Catalog: Wood AR et al. Variants in the FTO and CDKAL1 loci have recessive effects on risk of obesity and type 2 diabetes, respectively. Diabetologia. 2016 Jun;59(6):1214-21. doi: 10.1007/s00125-016-3908-5. Epub 2016 Mar 10. PMID: 26961502; PMCID: PMC4869698.

  • T1D data are also from lab collection: Onengut-Gumuscu S et al. Fine mapping of type 1 diabetes susceptibility loci and evidence for colocalization of causal variants with lymphoid gene enhancers. Nat Genet. 2015 Apr;47(4):381-6. doi: 10.1038/ng.3245. Epub 2015 Mar 9. PMID: 25751624; PMCID: PMC4380767.

  • Asthma. I first used data from our lab collection: Zhu et al. Shared Genetics of Asthma and Mental Health Disorders: A Large-Scale Genome-Wide Cross-Trait Analysis. European Respiratory Journal, 2019 (PMID: 31619474). However, this dataset doesn’t contain MAF information for the variants, so I added data from the GWAS Catalog: Ferreira MAR et al. Genetic Architectures of Childhood- and Adult-Onset Asthma Are Partly Distinct. The American Journal of Human Genetics, Volume 104, Issue 4, 2019, Pages 665-684, ISSN 0002-9297, https://doi.org/10.1016/j.ajhg.2019.02.022.

  • IBD, ulcerative colitis (UC) and Crohn’s disease (CD) data are from lab collection: Liu, van Sommeren et al., Nature Genetics, 2015.

  • Waist-hip ratio data are from Shungin et al. New genetic loci link adipose and insulin biology to body fat distribution. Nature. 2015 Feb 12;518(7538):187-196. doi: 10.1038/nature14132. PMID: 25673412; PMCID: PMC4338562.

  • BMI data are also from lab collection: Locke AE, Kahali B, Berndt SI, Justice AE, Pers TH, Day FR, Powell C, Vedantam S, Buchkovich ML, Yang J, Croteau-Chonka DC, Esko T et al. (2015). Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197-206

  • HDL, LDL, TG and TC data are downloaded from GWAS Catalog. Global Lipids Genetics Consortium., Willer, C., Schmidt, E. et al. Discovery and refinement of loci associated with lipid levels. Nat Genet 45, 1274-1283 (2013). https://doi.org/10.1038/ng.2797

  • Height data are from lab collection: Yengo L, Sidorenko J, Kemper KE, et al. Meta-analysis of genome-wide association studies for height and body mass index in ~700000 individuals of European ancestry. Hum Mol Genet. 2018;27(20):3641-3649. doi:10.1093/hmg/ddy271

  • Schizophrenia data are from lab collection: Schizophrenia Working Group of the Psychiatric Genomics Consortium., Ripke, S., Neale, B. et al. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421-427 (2014). https://doi.org/10.1038/nature13595

  • Vitiligo data are from lab collection: Jin Y et al. Genome-wide association studies of autoimmune vitiligo identify 23 new risk loci and highlight key pathways and regulatory variants. Nat Genet. 2016 Nov;48(11):1418-1424. doi: 10.1038/ng.3680. Epub 2016 Oct 10. PMID: 27723757; PMCID: PMC5120758.

  • Allergy data are from lab collection: Ferreira MA et al. Shared genetic origin of asthma, hay fever and eczema elucidates allergic disease biology. Nat Genet. 2017 Dec;49(12):1752-1757. doi: 10.1038/ng.3985. Epub 2017 Oct 30. PMID: 29083406; PMCID: PMC5989923.

  2. Filtered the SNPs from the GWAS Catalog data using pval < 5e-8 as the cutoff.

  3. Did LD clumping for the SNPs from step 2. The PLINK LD clumping parameters are:

    --clump-p1 0.0001 Significance threshold for index SNPs

    --clump-p2 0.01 Secondary significance threshold for clumped SNPs

    --clump-r2 0.1 LD threshold for clumping

    --clump-kb 1000 Physical distance threshold for clumping

For each trait, this gave a subset of SNPs that are not in LD with each other.

  4. Did association tests between the plier_canonical factors and the SNPs from step 3. The association tests were adjusted for three sets of covariates: 1) 10 genotype PCs of the whole genome; 2) 10 PCs + GTEx sequencing platform, sequencing protocol and sex; 3) 10 PCs + GTEx sequencing platform, sequencing protocol, sex + age.

  5. For each trait, I made a plot of the association with the LV (measured by beta) vs the association with the trait (measured by ln(odds ratio) or beta in the GWAS) to show whether the variants have correlated effect directions. The effect sizes from the Catalog GWAS and the factor association tests are harmonized with the TwoSampleMR R package so that the effect alleles in the two analyses are identical. Only LVs with more than one significant SNP (FDR < 0.2) are included in the plots. For each plot, I also fitted the points with intercept = 0. The p-values and r-squared are shown on the plots.

  6. For the traits and LVs in step 5, I made an info table to show more details of the SNPs.

  7. For several LVs we are interested in, I did gene set enrichment analysis (GSEA) to test whether the LVs are correlated with KEGG/REACTOME pathways. I used two kinds of gene sets: 1) the genes used to compute the LVs; 2) the genes in 1) sorted by their loadings, taking the top 25% as the gene set. For both gene sets, the gene scores used as input to GSEA are the gene loadings.

  8. Resampling. For some promising trait-factor pairs, I resampled the SNPs without replacement, fitted the points with intercept = 0 again and recorded the p-values and r-squared. The resampling was repeated 1000 times. The following plots are the resampling results.
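As a rough illustration of the filtering and clumping steps above, here is a minimal Python sketch of a greedy clumping routine on toy data. It is not PLINK's implementation and was not part of the original analysis (which used PLINK and R); the function name `greedy_clump`, the SNP records and the `r2` lookup table are all hypothetical, and only the four thresholds mirror the parameters listed above.

```python
def greedy_clump(snps, r2, p1=1e-4, p2=1e-2, r2_thresh=0.1, kb=1000):
    """Toy greedy clumping (illustrative only, not PLINK's code).

    snps: list of dicts with keys 'id', 'chrom', 'pos' (bp) and 'p'.
    r2:   dict mapping frozenset({id_a, id_b}) -> LD r-squared (missing = 0).
    Returns the ids of the index SNPs, one per clump.
    """
    ordered = sorted(snps, key=lambda s: s["p"])  # most significant first
    assigned = set()
    index_snps = []
    for lead in ordered:
        # A SNP can only start a clump if it is unassigned and passes p1.
        if lead["id"] in assigned or lead["p"] > p1:
            continue
        index_snps.append(lead["id"])
        assigned.add(lead["id"])
        for other in ordered:
            if other["id"] in assigned:
                continue
            close = (other["chrom"] == lead["chrom"]
                     and abs(other["pos"] - lead["pos"]) <= kb * 1000)
            linked = r2.get(frozenset((lead["id"], other["id"])), 0.0) >= r2_thresh
            if close and linked and other["p"] <= p2:
                assigned.add(other["id"])  # absorbed into the lead SNP's clump
    return index_snps


# Hypothetical toy input: rsB is in LD with rsA, rsC is far away.
toy_snps = [
    {"id": "rsA", "chrom": 1, "pos": 100_000, "p": 1e-12},
    {"id": "rsB", "chrom": 1, "pos": 150_000, "p": 1e-9},
    {"id": "rsC", "chrom": 1, "pos": 5_000_000, "p": 2e-10},
]
toy_r2 = {frozenset(("rsA", "rsB")): 0.8}
clumped = greedy_clump(toy_snps, toy_r2)
print(clumped)
```

In this toy input rsB is absorbed into the clump led by rsA, while rsC is more than 1000 kb away and starts its own clump.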

SNPs after filtering

After filtering by ‘pval < 5e-8’ and LD clumping, the number of SNPs for each trait is:

platelet count (plt) trait contains 688 SNPs with pval<5e-8.

white blood cell count (wbc) trait contains 368 SNPs with pval<5e-8.

myeloid white cell count (myeloid wbc) trait contains 319 SNPs with pval<5e-8.

lymphocyte count(lymph) trait contains 436 SNPs with pval<5e-8.

red blood cell count (rbc) trait contains 466 SNPs with pval<5e-8.

granulocyte count(gran) trait contains 316 SNPs with pval<5e-8.

eosinophil count(eo) trait contains 491 SNPs with pval<5e-8.

neutrophil count (neut) trait contains 317 SNPs with pval<5e-8.

IBD trait contains 116 SNPs with pval<5e-8.

Ulcerative colitis (UC) trait contains 73 SNPs with pval<5e-8.

Crohn’s disease(CD) trait contains 96 SNPs with pval<5e-8.

BMI trait contains 104 SNPs with pval<5e-8.

T2D contains 14 SNPs with pval<5e-8. T2D_2 contains 4 SNPs with pval<5e-8.

Asthma trait contains 186 SNPs with pval<5e-8. Asthma_2 trait contains 112 SNPs with pval<5e-8.

HDL trait contains 227 SNPs with pval<5e-8.

LDL trait contains 204 SNPs with pval<5e-8.

WHR trait contains 36 SNPs with pval<5e-8.

TC trait contains 250 SNPs with pval<5e-8.

TG trait contains 155 SNPs with pval<5e-8.

Height trait contains 5846 SNPs with pval<5e-8.

SCZ trait contains 135 SNPs with pval<5e-8.

Vitiligo trait contains 101 SNPs with pval<5e-8.

T1D trait contains 69 SNPs with pval<5e-8.

Allergy trait contains 158 SNPs with pval<5e-8.

Results - pval < 5e-8 & association test covariates: 10 PCs + GTEx: Sequencing platform, Sequencing protocol, Sex + AGE

Summary table

I used the ‘qvalue’ R package to compute the FDR from the p-values for each SNP and made a table showing the number of SNPs that pass each threshold. The thresholds are ‘fdr < 0.1’, ‘fdr < 0.2’ and ‘pval < 5e-8’. ‘num_significant_pairs’ is the number of significant pairs under each threshold; a trait~factor pair is called a ‘significant pair’ if it has at least 1 significant SNP.

Number of significant trait-LV associations across all traits
threshold num_significant_pairs
num_sigpairs_fdr010 146
num_sigpairs_fdr020 383
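To make the counting rule concrete, the sketch below counts ‘significant pairs’ on toy p-values. It is written in Python for illustration and makes two loud assumptions: it uses the Benjamini-Hochberg adjustment as a stand-in (this is not the Storey q-value computed by the ‘qvalue’ R package, so the numbers would differ), and it adjusts p-values within each trait-LV pair, which the text does not specify. The names `bh_adjust`, `count_significant_pairs` and the toy `results` are hypothetical.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_significant_pairs(results, threshold):
    """results: dict (trait, lv) -> list of per-SNP association p-values.

    A pair counts as significant if at least one SNP passes the threshold.
    """
    return sum(
        1 for pvals in results.values()
        if any(q < threshold for q in bh_adjust(pvals))
    )


# Hypothetical toy p-values for two trait-LV pairs.
results = {
    ("BMI", "LV27"): [0.001, 0.2, 0.5, 0.9],
    ("BMI", "LV30"): [0.3, 0.4, 0.6, 0.8],
}
n_pairs_fdr010 = count_significant_pairs(results, 0.1)
n_pairs_fdr020 = count_significant_pairs(results, 0.2)
print(n_pairs_fdr010, n_pairs_fdr020)
```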

Info table

For each trait, I made a table showing the info of the SNPs with FDR < 0.2 in the factor ~ SNP + genotype PCs association test. For each trait, only the LVs with more than one significant SNP (FDR < 0.2) are included.

The suffix ’_assoc’ means that the results are from the factor ~ SNP + genotype PCs association test. The suffix ’_gwas’ means that the results are from the original GWAS result files. For EUR.CD, EUR.IBD, EUR.UC, T2D and asthma, effectsize_gwas means ‘ln(OR)’; for the other traits, it means ‘beta’.

‘snp_ld’ lists the SNPs that are in LD with the SNP on each line, and ‘ld_r2’ gives the corresponding LD r-squared. The ‘cis-eqtl’ column indicates whether the SNP is a cis-eQTL according to GTEx data. ‘cis_gene_hgnc’ and ‘cis_gene_hgnc’ are the genes that the SNP influences when it acts as a cis-eQTL. ‘func’ and ‘func_gene’ are obtained from ANNOVAR and indicate the function of the SNP within the genes.
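For readers unfamiliar with the ‘factor ~ SNP + genotype PCs’ notation, the following Python sketch fits such a linear model by ordinary least squares on toy data. It is only an illustration under stated assumptions: the original tests were run in R with 10 PCs and further covariates, whereas here a single made-up covariate `pc1` stands in for them, and the function `ols` and all data values are hypothetical. P-values are left out because the Python standard library has no t distribution.

```python
def ols(y, columns):
    """Ordinary least squares with an intercept.

    y: list of floats; columns: list of predictor lists (SNP dosage first).
    Returns (coefficients, standard_errors); index 0 is the intercept.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented matrix [X'X | X'y | I]; Gauss-Jordan elimination turns it
    # into [I | beta | (X'X)^-1].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)]
         + [sum(X[r][i] * y[r] for r in range(n))]
         + [1.0 if i == j else 0.0 for j in range(p)]
         for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[piv] = A[piv], A[c]
        d = A[c][c]
        A[c] = [v / d for v in A[c]]
        for r in range(p):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    beta = [A[i][p] for i in range(p)]
    resid = [y[r] - sum(b * x for b, x in zip(beta, X[r])) for r in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - p)  # residual variance
    se = [(sigma2 * A[i][p + 1 + i]) ** 0.5 for i in range(p)]
    return beta, se


# Hypothetical toy data: factor = 0.5 + 0.3 * dosage + 0.1 * pc1 + small noise.
dosage = [0, 1, 2, 0, 1, 2, 1, 0]
pc1 = [0.5, -1.0, 0.3, 1.2, -0.7, 0.1, -0.4, 0.9]
noise = [0.01, -0.01, 0.02, -0.02, 0.01, -0.01, 0.0, 0.0]
factor = [0.5 + 0.3 * d + 0.1 * c + e for d, c, e in zip(dosage, pc1, noise)]

beta, se = ols(factor, [dosage, pc1])
beta_assoc, se_assoc = beta[1], se[1]  # SNP effect and its standard error
print(beta_assoc, se_assoc)
```

Here `beta_assoc` plays the role of the ’_assoc’ effect size described above; the remaining coefficients belong to the intercept and the covariate.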

Enrichment analysis

For some promising trait-factor pairs (i.e. BMI-LV3, LV27, RBC-LV82, Asthma-LV68, WBC-LV119, Lymphocyte-LV23, LV78), I did enrichment analysis with WebGestalt. The analyses were run under the following settings:

  1. ORA: For each factor, I sorted the genes with non-zero loadings by their loadings and took the top 25% as the gene set. The functional databases are Reactome pathway; DisGeNET + GLAD4U + OMIM disease datasets; and Gene Ontology biological process. The reference set is affy hugene 2 0 st v1. The minimum number of genes for a category is 5 and the maximum is 2000 (default settings). All categories with fdr < 0.01 are listed.
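The gene set construction and the over-representation test can be sketched as follows. This is a minimal Python illustration on made-up genes, not WebGestalt's code: it assumes the usual hypergeometric upper-tail test for ORA, and the function names, loadings, pathway and reference set are all hypothetical.

```python
from math import comb


def top_quartile_genes(loadings):
    """Keep genes with non-zero loading and return the top 25% by loading."""
    nonzero = {g: w for g, w in loadings.items() if w != 0}
    ranked = sorted(nonzero, key=nonzero.get, reverse=True)
    return set(ranked[: max(1, len(ranked) // 4)])


def ora_pvalue(gene_set, pathway, reference):
    """Hypergeometric upper-tail probability P(overlap >= observed)."""
    N = len(reference)                      # size of the reference set
    K = len(pathway & reference)            # pathway genes in the reference
    n = len(gene_set & reference)           # query genes in the reference
    k = len(gene_set & pathway & reference) # observed overlap
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / total


# Hypothetical toy inputs: 20 reference genes, 8 genes with non-zero loading.
reference = {f"g{i}" for i in range(1, 21)}
loadings = {"g1": 0.9, "g2": 0.8, "g3": 0.7, "g4": 0.6,
            "g5": 0.5, "g6": 0.4, "g7": 0.3, "g8": 0.2, "g9": 0.0}
pathway = {"g1", "g2", "g10"}

gene_set = top_quartile_genes(loadings)
p_ora = ora_pvalue(gene_set, pathway, reference)
print(sorted(gene_set), p_ora)
```

In the real analysis the resulting p-values would still need a multiple-testing correction across categories before applying the fdr < 0.01 cutoff.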

BMI-LV27

BMI-LV76

BMI-LV90

PLT-LV49

RBC-LV82

Asthma-LV36

Asthma-LV39

WBC-LV6


WBC-LV119

Lymphocyte-LV23

Lymphocyte-LV26

LDL-LV125

WHR

Effect size plots & Resampling

For each trait, I made a plot of the association with the LV (measured by beta) vs the association with the trait (measured by ln(odds ratio) or beta in the GWAS) to show whether the variants have correlated effect directions. The effect sizes from the Catalog GWAS and the factor association tests are harmonized with the TwoSampleMR R package so that the effect alleles in the two analyses are identical. Only LVs with more than one significant SNP (FDR < 0.2) are included in the plots. For each plot, I also fitted the points with intercept = 0. The p-values and r-squared are shown on the plots.
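A fit with intercept = 0 has a simple closed form, sketched below in Python on toy effect sizes. The assumptions are mine: the LV effect is taken as x and the trait effect as y, and r-squared is the uncentered version 1 - RSS / sum(y^2), which is the convention R's lm() reports for models without an intercept. The function name and the effect sizes are hypothetical, and the p-value is omitted.

```python
def fit_through_origin(x, y):
    """Least-squares line y = slope * x with the intercept fixed at 0.

    Returns (slope, r_squared) with the uncentered r-squared
    1 - RSS / sum(y^2).
    """
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    syy = sum(b * b for b in y)
    slope = sxy / sxx
    rss = sum((b - slope * a) ** 2 for a, b in zip(x, y))
    return slope, 1.0 - rss / syy


# Hypothetical harmonized effect sizes for four SNPs.
beta_lv = [0.1, -0.2, 0.3, 0.4]         # effect on the LV
beta_trait = [0.02, -0.05, 0.06, 0.07]  # effect on the trait

slope, r_squared = fit_through_origin(beta_lv, beta_trait)
print(slope, r_squared)
```

Forcing the line through the origin encodes the expectation that a SNP with no effect on the LV should have no LV-mediated effect on the trait.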

I also relaxed the FDR threshold for the SNPs used in the effect size plots (from 0.2 to 0.3 and 0.5).

For all pairs, I did resampling: I resampled the SNPs without replacement, fitted the points with intercept = 0 again and recorded the p-values and r-squared. The resampling was repeated 1000 times.

I made a histogram of the r-squared distribution from resampling. The red line in each plot is the r-squared from the original analysis. The r_mean value in each plot is the mean r-squared from the fits. The ‘p-value from resampling’ is computed as (number of more extreme values) / (number of resamplings).
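The resampling p-value can be sketched as below. This Python example rests on assumptions that the text does not state: each resample draws k SNPs (the number used in the original fit) without replacement from the trait's full SNP pool, and ‘more extreme’ means a resampled r-squared at least as large as the observed one. The function names, the seed and the simulated pool are hypothetical.

```python
import random


def r2_through_origin(x, y):
    """Uncentered r-squared of the least-squares fit y = slope * x."""
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    syy = sum(b * b for b in y)
    slope = sxy / sxx
    rss = sum((b - slope * a) ** 2 for a, b in zip(x, y))
    return 1.0 - rss / syy


def resampling_pvalue(pool_x, pool_y, k, observed_r2, n_iter=1000, seed=1):
    """Return (p_value, mean_r2) from n_iter resamples of k SNPs each."""
    rng = random.Random(seed)  # arbitrary seed, only for repeatability
    indices = range(len(pool_x))
    r2_values = []
    for _ in range(n_iter):
        pick = rng.sample(indices, k)  # sampling without replacement
        r2_values.append(r2_through_origin([pool_x[i] for i in pick],
                                           [pool_y[i] for i in pick]))
    n_extreme = sum(r2 >= observed_r2 for r2 in r2_values)
    return n_extreme / n_iter, sum(r2_values) / n_iter


# Hypothetical pool of 50 SNPs with weakly related effect sizes.
data_rng = random.Random(0)
pool_lv = [data_rng.gauss(0, 1) for _ in range(50)]
pool_trait = [0.1 * a + data_rng.gauss(0, 1) for a in pool_lv]

p_value, r_mean = resampling_pvalue(pool_lv, pool_trait, k=5, observed_r2=0.9)
print(p_value, r_mean)
```

A small p-value here means that few random SNP subsets reach an r-squared as high as the one observed for the FDR-selected SNPs.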

Allergy

LV33

fdr0.2

fdr0.3

fdr0.5

LV74

fdr0.2

fdr0.3

fdr0.5

LV110

fdr0.2

fdr0.3

fdr0.5

Asthma_2

LV88

fdr0.2

fdr0.3

fdr0.5

LV91

fdr0.2

fdr0.3

fdr0.5

BMI

LV27

fdr0.2

fdr0.3

fdr0.5

LV30

fdr0.2

fdr0.3

fdr0.5

LV76

fdr0.2

fdr0.3

fdr0.5

LV90

fdr0.2

fdr0.3

fdr0.5

eosinophil count(eo)

LV106

fdr0.2

fdr0.3

fdr0.5

LV125

fdr0.2

fdr0.3

fdr0.5

Crohn’s disease(CD)

LV106

fdr0.2

fdr0.3

fdr0.5

IBD

LV98

fdr0.2

fdr0.3

fdr0.5

Ulcerative colitis (UC)

LV23

fdr0.2

fdr0.3

fdr0.5

LV24

fdr0.2

fdr0.3

fdr0.5

LV108

fdr0.2

fdr0.3

fdr0.5

HDL

LV3

fdr0.2

fdr0.3

fdr0.5

LV28

fdr0.2

fdr0.3

fdr0.5

LV69

fdr0.2

fdr0.3

fdr0.5

LV72

fdr0.2

fdr0.3

fdr0.5

LV101

fdr0.2

fdr0.3

fdr0.5

Height

LV24

fdr0.2

fdr0.3

fdr0.5

LV87

fdr0.2

fdr0.3

fdr0.5

LV96

fdr0.2

fdr0.3

fdr0.5

LDL

LV23

fdr0.2

fdr0.3

fdr0.5

LV79

fdr0.2

fdr0.3

fdr0.5

LV113

fdr0.2

fdr0.3

fdr0.5

LV117

fdr0.2

fdr0.3

fdr0.5

LV125

fdr0.2

fdr0.3

fdr0.5

TG

LV30

fdr0.2

fdr0.3

fdr0.5

granulocyte count(gran)

None of the LVs have >1 SNPs at FDR<0.2.

lymphocyte count(lymph)

LV23

fdr0.2

fdr0.3

fdr0.5

LV26

fdr0.2

fdr0.3

fdr0.5

LV37

fdr0.2

fdr0.3

fdr0.5

LV65

fdr0.2

fdr0.3

fdr0.5

LV123

fdr0.2

fdr0.3

fdr0.5

myeloid white cell count (myeloid wbc)

LV65

fdr0.2

fdr0.3

fdr0.5

LV94

fdr0.2

fdr0.3

fdr0.5

LV119

fdr0.2

fdr0.3

fdr0.5

neutrophil count (neut)

LV6

fdr0.2

fdr0.3

fdr0.5

LV95

fdr0.2

fdr0.3

fdr0.5

plt

LV49

fdr0.2

fdr0.3

fdr0.5

LV97

fdr0.2

fdr0.3

fdr0.5

rbc

LV3

fdr0.2

fdr0.3

fdr0.5

LV20

fdr0.2

fdr0.3

fdr0.5

LV82

fdr0.2

fdr0.3

fdr0.5

LV118

fdr0.2

fdr0.3

fdr0.5

SCZ

LV7

fdr0.2

fdr0.3

fdr0.5

LV98

fdr0.2

fdr0.3

fdr0.5

LV100

fdr0.2

fdr0.3

fdr0.5

LV118

fdr0.2

fdr0.3

fdr0.5

T1D

LV30

fdr0.2

fdr0.3

fdr0.5

T2D_2

Only 4 SNPs were left after filtering by p-value in the GWAS summary data, and all of them have fdr < 0.2 in the association test, so there are no other SNPs available for resampling.

T2D

LV1

fdr0.2

fdr0.3

fdr0.5

TC

LV20

fdr0.2

fdr0.3

fdr0.5

LV23

fdr0.2

fdr0.3

fdr0.5

LV125

fdr0.2

fdr0.3

fdr0.5

TG

LV30

fdr0.2

fdr0.3

fdr0.5

asthma

LV21

fdr0.2

fdr0.3

fdr0.5

LV36

fdr0.2

fdr0.3

fdr0.5

LV39

fdr0.2

fdr0.3

fdr0.5

LV68

fdr0.2

fdr0.3

fdr0.5

LV82

fdr0.2

fdr0.3

fdr0.5

Vitiligo

LV35

fdr0.2

fdr0.3

fdr0.5

LV69

fdr0.2

fdr0.3

fdr0.5

LV82

fdr0.2

fdr0.3

fdr0.5

LV97

fdr0.2

fdr0.3

fdr0.5

LV109

fdr0.2

fdr0.3

fdr0.5

wbc

LV6

fdr0.2

fdr0.3

fdr0.5

LV119

fdr0.2

fdr0.3

fdr0.5

WHR

LV1

fdr0.2

fdr0.3

fdr0.5

LV47

fdr0.2

fdr0.3

fdr0.5

LV82

fdr0.2

fdr0.3

fdr0.5

LV106

fdr0.2

fdr0.3

fdr0.5

Effect size summary

To summarize the results above, I made a scatter plot of all trait-LV pairs. The x-axis shows the R2 of the fit, and the y-axis shows the -log10 p-values from resampling. Below are plots for three FDR cutoffs: 0.2, 0.3 and 0.5. Each point represents a trait-LV pair, and the colors show the number of SNPs passing the FDR threshold for each pair.

Summary Table

Effect size plots - checking the reverse causality (19 traits)

To check whether the effect size correlation is due to reverse causality, i.e. trait -> LV (the trait causally affects the LV) instead of LV -> trait (which is what we would like to see), I used all SNPs associated with the traits (pval < 5e-8). The x-axis shows the effects of these SNPs on the trait, and the y-axis shows their effects on the LV.

Some pairs show p < 0.05, but this result may be driven by a possible causal effect of LV -> trait. To test this, I removed the SNPs that are associated with the LVs at FDR < 0.2 and made the plots again.

BMI

LDL

lymphocyte count(lymph)

platelet count (plt)

red blood cell count (rbc)

asthma

white blood cell count

WHR

QQplots (19 traits)

BMI

LV27, LV76 and LV90

LDL

LV125


platelet count (plt)

LV49

red blood cell count (rbc)

LV82

asthma

LV36 and LV79

white blood cell count

LV6 and LV119

WHR

LV47

Colocalization – 500kb (19 traits)

The colocalization analysis was performed using the approximate Bayes factor test implemented in the Coloc package. Coloc computes five posterior probabilities (PP0, PP1, PP2, PP3 and PP4), each corresponding to a hypothesis: H0, no association with either trait; H1, association with trait 1 but not with trait 2; H2, association with trait 2 but not with trait 1; H3, association with trait 1 and trait 2, two independent SNPs; H4, association with trait 1 and trait 2, one shared SNP. We ran Coloc with the default parameters and used PP4 to assess evidence of colocalization. We visualized the colocalization of factor - QTLs and GWAS associations using the LocusCompareR package.
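To show how the five posterior probabilities relate to per-SNP evidence, here is a minimal Python sketch of the approximate Bayes factor calculation on toy data. It is not the Coloc package's code: the prior values p1 = 1e-4, p2 = 1e-4, p12 = 1e-5 and the prior standard deviation 0.15 are assumptions chosen for the example, the H3 term is computed by summing explicitly over pairs of distinct SNPs for numerical simplicity, and all effect sizes and function names are hypothetical.

```python
from math import exp, log


def log_sum(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + log(sum(exp(v - m) for v in values))


def wakefield_labf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor for one SNP (Wakefield approximation)."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (log(1.0 - r) + r * z * z)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [PP0, PP1, PP2, PP3, PP4]; needs >= 2 SNPs."""
    shared = [a + b for a, b in zip(labf1, labf2)]       # same causal SNP
    distinct = [a + b for i, a in enumerate(labf1)
                for j, b in enumerate(labf2) if i != j]  # two different SNPs
    log_h = [
        0.0,                                   # H0: no association
        log(p1) + log_sum(labf1),              # H1: trait 1 only
        log(p2) + log_sum(labf2),              # H2: trait 2 only
        log(p1) + log(p2) + log_sum(distinct), # H3: two independent SNPs
        log(p12) + log_sum(shared),            # H4: one shared SNP
    ]
    denom = log_sum(log_h)
    return [exp(v - denom) for v in log_h]


# Hypothetical region of 3 SNPs; the second SNP is strong in both datasets.
se = 0.03
labf_factor = [wakefield_labf(b, se) for b in (0.01, 0.30, 0.02)]
labf_gwas = [wakefield_labf(b, se) for b in (0.00, 0.25, 0.01)]

pp = coloc_posteriors(labf_factor, labf_gwas)
print([round(v, 4) for v in pp])
```

In this toy region both datasets peak at the same SNP, so nearly all posterior mass falls on H4.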

SNP selection: 1. I chose the SNPs in the info table. 2. For each SNP, the region used in the colocalization analysis is [pos-100kb, pos+100kb]. 3. All SNPs in this region are included in the analysis.
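The region selection amounts to a positional filter around each lead SNP, sketched here in Python. The function name `snps_in_window`, the SNP records and the lead position are hypothetical; the flank is a parameter whose default follows the 100kb stated above.

```python
def snps_in_window(snps, chrom, pos, flank_bp=100_000):
    """Return SNPs on `chrom` within [pos - flank_bp, pos + flank_bp]."""
    return [s for s in snps
            if s["chrom"] == chrom and abs(s["pos"] - pos) <= flank_bp]


# Hypothetical SNP records around a lead SNP at chr1:1,000,000.
toy_snps = [
    {"id": "rs1", "chrom": 1, "pos": 905_000},
    {"id": "rs2", "chrom": 1, "pos": 1_000_000},
    {"id": "rs3", "chrom": 1, "pos": 1_100_000},
    {"id": "rs4", "chrom": 1, "pos": 1_100_001},
    {"id": "rs5", "chrom": 2, "pos": 1_000_000},
]
region = snps_in_window(toy_snps, chrom=1, pos=1_000_000)
print([s["id"] for s in region])
```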

Summary Table

BMI

LV27

PPs in colocalization analysis
statistic value note
nsnps 499 NA
PP.H0.abf 7.93444900060162e-146 no association with either trait
PP.H1.abf 4.61967175649592e-147 association with trait 1 but not with trait 2
PP.H2.abf 0.489837805633518 association with trait 2 but not with trait 1
PP.H3.abf 0.0280376867076828 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.482124507658802 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 458 NA
PP.H0.abf 3.61095771264185e-09 no association with either trait
PP.H1.abf 3.83722698846073e-10 association with trait 1 but not with trait 2
PP.H2.abf 0.301833782768097 association with trait 2 but not with trait 1
PP.H3.abf 0.0314079662287754 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.666758247008448 association with trait 1 and trait 2, one shared SNP

LV76

PPs in colocalization analysis
statistic value note
nsnps 750 NA
PP.H0.abf 0.00544055184691358 no association with either trait
PP.H1.abf 0.000612167269311149 association with trait 1 but not with trait 2
PP.H2.abf 0.103096274108392 association with trait 2 but not with trait 1
PP.H3.abf 0.0107201932662791 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.880130813509105 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 551 NA
PP.H0.abf 0.00776682662078695 no association with either trait
PP.H1.abf 0.0040703879647715 association with trait 1 but not with trait 2
PP.H2.abf 0.106905431720179 association with trait 2 but not with trait 1
PP.H3.abf 0.0552002460259568 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.826057107668306 association with trait 1 and trait 2, one shared SNP

LDL

LV125

PPs in colocalization analysis
statistic value note
nsnps 442 NA
PP.H0.abf 1.11005189466216e-09 no association with either trait
PP.H1.abf 9.92186252263669e-11 association with trait 1 but not with trait 2
PP.H2.abf 0.689436970027545 association with trait 2 but not with trait 1
PP.H3.abf 0.0613740456379771 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.249188983125208 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 223 NA
PP.H0.abf 3.30557717943082e-207 no association with either trait
PP.H1.abf 4.95611425569479e-209 association with trait 1 but not with trait 2
PP.H2.abf 0.941105250955241 association with trait 2 but not with trait 1
PP.H3.abf 0.0140653392451396 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0448294097996179 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 479 NA
PP.H0.abf 2.75600123046051e-31 no association with either trait
PP.H1.abf 2.57382238436189e-31 association with trait 1 but not with trait 2
PP.H2.abf 0.158779711374355 association with trait 2 but not with trait 1
PP.H3.abf 0.147590329714633 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.693629958911014 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 477 NA
PP.H0.abf 2.75603750829864e-31 no association with either trait
PP.H1.abf 2.57362778508749e-31 association with trait 1 but not with trait 2
PP.H2.abf 0.158781801425914 association with trait 2 but not with trait 1
PP.H3.abf 0.147579109262451 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.693639089311643 association with trait 1 and trait 2, one shared SNP

lymphocyte count(lymph)

LV26

PPs in colocalization analysis
statistic value note
nsnps 1898 NA
PP.H0.abf 5.65594665246378e-07 no association with either trait
PP.H1.abf 3.59287155267202e-07 association with trait 1 but not with trait 2
PP.H2.abf 0.491638980696366 association with trait 2 but not with trait 1
PP.H3.abf 0.312111454467391 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.196248639954423 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 1530 NA
PP.H0.abf 0.000143508683979133 no association with either trait
PP.H1.abf 6.82191082491261e-05 association with trait 1 but not with trait 2
PP.H2.abf 0.547934969500356 association with trait 2 but not with trait 1
PP.H3.abf 0.260277923478846 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.19157537922857 association with trait 1 and trait 2, one shared SNP

platelet count (plt)

LV49

PPs in colocalization analysis
statistic value note
nsnps 1531 NA
PP.H0.abf 2.77982713819106e-09 no association with either trait
PP.H1.abf 9.18877382567472e-10 association with trait 1 but not with trait 2
PP.H2.abf 0.707061726525701 association with trait 2 but not with trait 1
PP.H3.abf 0.233661381370352 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0592768884052437 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 1369 NA
PP.H0.abf 7.10298463836551e-08 no association with either trait
PP.H1.abf 1.69842486167328e-07 association with trait 1 but not with trait 2
PP.H2.abf 0.0903121845742953 association with trait 2 but not with trait 1
PP.H3.abf 0.215254872276519 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.694432702276853 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 1982 NA
PP.H0.abf 8.11653188178442e-16 no association with either trait
PP.H1.abf 2.15134459755577e-16 association with trait 1 but not with trait 2
PP.H2.abf 0.745363911198891 association with trait 2 but not with trait 1
PP.H3.abf 0.197506885443035 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0571292033580722 association with trait 1 and trait 2, one shared SNP

red blood cell count(rbc)

LV82

PPs in colocalization analysis
statistic value note
nsnps 2595 NA
PP.H0.abf 4.21930207358819e-08 no association with either trait
PP.H1.abf 1.09071966253847e-08 association with trait 1 but not with trait 2
PP.H2.abf 0.716838029725225 association with trait 2 but not with trait 1
PP.H3.abf 0.185209788402761 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0979521287717977 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 2104 NA
PP.H0.abf 1.34092817645511e-17 no association with either trait
PP.H1.abf 3.69731390584295e-18 association with trait 1 but not with trait 2
PP.H2.abf 0.198488853172484 association with trait 2 but not with trait 1
PP.H3.abf 0.0539813968155799 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.747529750011933 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 1772 NA
PP.H0.abf 3.26465833016418e-141 no association with either trait
PP.H1.abf 9.94717668727846e-142 association with trait 1 but not with trait 2
PP.H2.abf 0.741687836885181 association with trait 2 but not with trait 1
PP.H3.abf 0.225954536381002 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0323576267338294 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 1732 NA
PP.H0.abf 4.17309901957295e-05 no association with either trait
PP.H1.abf 1.14195750828672e-05 association with trait 1 but not with trait 2
PP.H2.abf 0.0831749398201746 association with trait 2 but not with trait 1
PP.H3.abf 0.0218656960591087 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.894906213555439 association with trait 1 and trait 2, one shared SNP

white blood cell count

LV6

PPs in colocalization analysis
statistic value note
nsnps 1692 NA
PP.H0.abf 2.42069422133774e-20 no association with either trait
PP.H1.abf 5.85832479206917e-21 association with trait 1 but not with trait 2
PP.H2.abf 0.752317669664185 association with trait 2 but not with trait 1
PP.H3.abf 0.18200280383898 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0656795264968316 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 2520 NA
PP.H0.abf 0.00167804634271394 no association with either trait
PP.H1.abf 0.000518622191341393 association with trait 1 but not with trait 2
PP.H2.abf 0.190540023295325 association with trait 2 but not with trait 1
PP.H3.abf 0.0581397652226869 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.749123542947933 association with trait 1 and trait 2, one shared SNP

LV119

PPs in colocalization analysis
statistic value note
nsnps 1472 NA
PP.H0.abf 1.31758242169431e-23 no association with either trait
PP.H1.abf 6.74125533437269e-24 association with trait 1 but not with trait 2
PP.H2.abf 0.639020510048892 association with trait 2 but not with trait 1
PP.H3.abf 0.326913254190781 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.034066235760328 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 512 NA
PP.H0.abf 8.45984458247953e-94 no association with either trait
PP.H1.abf 4.53826813191688e-95 association with trait 1 but not with trait 2
PP.H2.abf 0.793985073521508 association with trait 2 but not with trait 1
PP.H3.abf 0.0424295985213398 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.163585327957141 association with trait 1 and trait 2, one shared SNP

WHR

LV47

PPs in colocalization analysis
statistic value note
nsnps 433 NA
PP.H0.abf 3.63129877087665e-07 no association with either trait
PP.H1.abf 2.19315268516939e-08 association with trait 1 but not with trait 2
PP.H2.abf 0.282771374037345 association with trait 2 but not with trait 1
PP.H3.abf 0.0163773582482725 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.700850882652979 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 807 NA
PP.H0.abf 4.0905285744147e-08 no association with either trait
PP.H1.abf 4.27896985899567e-09 association with trait 1 but not with trait 2
PP.H2.abf 0.198547668391827 association with trait 2 but not with trait 1
PP.H3.abf 0.019987966162058 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.781464320261861 association with trait 1 and trait 2, one shared SNP

Results - pval < 5e-8 & association test covariates: 10 PCs

Summary table

I used the ‘qvalue’ R package to compute the FDR from the p-values of each SNP and made a table showing the number of SNPs that pass each threshold. The thresholds are ‘fdr < 0.1’, ‘fdr < 0.2’ and ‘pval < 5e-8’.
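The counting step can be illustrated with the sketch below. Note that it swaps in the Benjamini-Hochberg adjustment, not Storey's q-value as implemented in the ‘qvalue’ package; the two coincide only when the estimated proportion of null tests is 1, so the counts from this sketch can be more conservative than those in the table. Function names are hypothetical.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (a stand-in for q-values)."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def count_passing(pvals):
    """Number of SNPs passing each of the three thresholds in the text."""
    fdr = bh_adjust(pvals)
    return {
        "fdr < 0.1": sum(q < 0.1 for q in fdr),
        "fdr < 0.2": sum(q < 0.2 for q in fdr),
        "pval < 5e-8": sum(p < 5e-8 for p in pvals),
    }


counts = count_passing([0.01, 0.04, 0.03, 0.5])
```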

Info table

For each trait, I made a table showing the info of SNPs with fdr<0.2 in the factor ~ SNP + genotype PCs association test. For each trait, the LVs that have more than one significant SNP at FDR<0.2 are included.

The suffix ‘_assoc’ means that the results are from the factor ~ SNP + genotype PCs association test. The suffix ‘_gwas’ means that the results are from the original GWAS result files. For EUR.CD, EUR.IBD and EUR.UC, effectsize_gwas is ‘ln(OR)’; for the other traits, it is ‘beta’.

‘snp_ld’ lists the SNPs in LD with the SNP on each line. ‘ld_r2’ is the LD r-squared corresponding to the ‘snp_ld’ column. The ‘cis-eqtl’ column indicates whether the SNP is a cis-eQTL according to GTEx data. ‘cis_gene_hgnc’ and ‘cis_gene_hgnc’ are the genes that the SNP influences when it acts as a cis-eQTL. ‘func’ and ‘func_gene’ are obtained from ANNOVAR and indicate the function of the SNP within the genes.

Enrichment analysis

For some promising trait-factor pairs (i.e. BMI-LV3, LV27, RBC-LV82, Asthma-LV68, WBC-LV119, Lymphocyte-LV23, LV78), I did enrichment analysis with WebGestalt. The analyses were run under different settings:

  1. ORA: For each factor, I sorted the genes in 1 by their loadings and took the top 25% as the gene set. The functional databases are Reactome pathway; Disgenet + GLAD4U + OMIM disease datasets; and geneontology biological process. The reference set is affy hugene 2 0 st v1. The minimum number of genes for a category is 5 and the maximum is 2000 (default settings). All categories with fdr<0.01 are listed.
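The gene-set construction and the over-representation test behind ORA can be sketched as follows. This is not WebGestalt's code: it only shows the top-25% selection described above and a generic one-sided hypergeometric p-value for a single category. Ranking by the raw loading value (rather than its absolute value) is an assumption, and the function names are hypothetical.

```python
import math


def top_fraction(loadings, frac=0.25):
    """Genes in the top `frac` of the ranking by loading (assumed: raw value)."""
    ranked = sorted(loadings, key=loadings.get, reverse=True)
    k = max(1, int(len(ranked) * frac))
    return set(ranked[:k])


def hypergeom_sf(k, N, K, n):
    """P(X >= k): overlap of at least k when n genes are drawn from a
    reference of N genes, K of which belong to the category."""
    upper = min(K, n)
    hits = sum(math.comb(K, i) * math.comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / math.comb(N, n)


loadings = {"g1": 0.9, "g2": 0.1, "g3": 0.5, "g4": 0.0,
            "g5": 0.7, "g6": 0.2, "g7": 0.3, "g8": 0.05}
gene_set = top_fraction(loadings)            # 2 of 8 genes
category = {"g1", "g5"}                      # toy category
p_value = hypergeom_sf(len(gene_set & category), len(loadings),
                       len(category), len(gene_set))
```

In practice the p-values of all categories that satisfy the size limits (5 to 2000 genes) would then be adjusted for multiple testing before applying the fdr<0.01 cutoff.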

BMI-LV3

BMI-LV27

BMI-LV76

BMI-LV90

RBC-LV82

Asthma-LV68

WBC-LV119

Lymphocyte-LV23

Lymphocyte-LV78

  2. GSEA: For each factor, I used all genes (zero-loading genes included), taking the gene loadings as the gene scores in GSEA. The functional database is Reactome pathway.

BMI-LV3-Reactome BMI-LV3-GO BMI-LV3-phenotype BMI-LV3-disease-Disgenet

BMI-LV27-Reactome BMI-LV27-GO BMI-LV27-phenotype BMI-LV27-disease-Disgenet

BMI-LV76-Reactome BMI-LV76-GO BMI-LV76-phenotype BMI-LV76-disease-Disgenet

RBC-LV82-Reactome RBC-LV82-GO RBC-LV82-phenotype RBC-LV82-disease-Disgenet

Asthma-LV68-Reactome Asthma-LV68-GO Asthma-LV68-phenotype Asthma-LV68-disease-Disgenet

WBC-LV119-Reactome WBC-LV119-GO WBC-LV119-phenotype WBC-LV119-disease-Disgenet

Lymphocyte-LV23-Reactome Lymphocyte-LV23-GO Lymphocyte-LV23-phenotype Lymphocyte-LV23-disease-Disgenet

Lymphocyte-LV78-Reactome Lymphocyte-LV78-GO Lymphocyte-LV78-phenotype Lymphocyte-LV78-disease-Disgenet

  3. GSEA: For each factor, I used all genes (zero-loading genes included), taking the gene loadings as the gene scores in GSEA. To avoid ties, I replaced each loading equal to 0 with a random number drawn from the normal distribution N(0,1e-5). The functional database is Reactome pathway.
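The tie-breaking step can be sketched as below. The text does not say whether 1e-5 in N(0,1e-5) is the variance or the standard deviation; this sketch treats it as the standard deviation, which is an assumption, and the function name is hypothetical.

```python
import random


def break_ties(loadings, scale=1e-5, seed=0):
    """Replace zero loadings with small Gaussian noise so GSEA ranks have no ties.

    scale is used as the standard deviation (assumption); non-zero
    loadings are returned unchanged.
    """
    rng = random.Random(seed)
    return {gene: (value if value != 0 else rng.gauss(0.0, scale))
            for gene, value in loadings.items()}


scores = break_ties({"g1": 0.8, "g2": 0.0, "g3": 0.0, "g4": 0.3})
```

Because the noise is several orders of magnitude smaller than typical non-zero loadings, it changes only the relative order among the zero-loading genes.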

BMI-LV3

BMI-LV27

BMI-LV76

RBC-LV82

Asthma-LV68

WBC-LV119

Lymphocyte-LV23

Lymphocyte-LV78

Effect size plots

For each trait, I plotted the association with the LV (indicated by beta) against the association with the trait (indicated by ln(odds ratio) or beta in the GWAS) to show whether the variants have correlated effect directions. The effect sizes from the Catalog GWAS and the factor association tests were harmonized with the TwoSampleMR R package so that the effect alleles in the two analyses are identical. The LVs that have more than one significant SNP at FDR<0.2 are included in the plots. In addition, for each plot, I fitted the points with a line with intercept = 0. The p-values and r-squared values are shown on the plots.
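The fit with intercept = 0 can be written out explicitly. The sketch below computes the slope, its standard error and the uncentered r-squared (the convention R's `lm` uses for models without an intercept); which axis holds which effect, and the function name, are assumptions for illustration. The p-value is omitted because it needs the t distribution, which is not in the Python standard library.

```python
import math


def fit_through_origin(x, y):
    """Least-squares fit of y = slope * x with no intercept.

    Returns (slope, r_squared, slope_se). r_squared is uncentered:
    1 - RSS / sum(y^2). Requires at least two points.
    """
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    syy = sum(b * b for b in y)
    slope = sxy / sxx
    rss = sum((b - slope * a) ** 2 for a, b in zip(x, y))
    r_squared = 1.0 - rss / syy
    slope_se = math.sqrt(rss / ((len(x) - 1) * sxx))  # n - 1 residual df
    return slope, r_squared, slope_se


# Toy effect sizes: x = effect on LV, y = effect on trait (assumed orientation).
slope, r2, se = fit_through_origin([1.0, 2.0], [1.0, 3.0])
```

Forcing the line through the origin encodes the expectation that a SNP with no effect on the LV should have no LV-mediated effect on the trait.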

BMI

eosinophil count(eo)

Crohn’s disease(CD)

IBD

Ulcerative colitis(UC)

granulocyte count(gran)

lymphocyte count(lymph)

myeloid white cell count (myeloid wbc)

neutrophil count (neut)

None of the LVs have >1 SNPs at FDR<0.2.

plt

None of the LVs have >1 SNPs at FDR<0.2.

rbc

asthma

wbc

Effect size plots – more SNPs

For some promising trait-factor pairs (i.e. BMI-LV3, LV27, LV76, RBC-LV82, Asthma-LV68, WBC-LV119, Lymphocyte-LV23, LV78), I relaxed the FDR threshold for the SNPs used to make the effect size plots (from 0.2 to 0.3/0.5).

I also made a plot to show the distribution of the SNPs’ fdr.

BMI

Crohn’s disease(CD)

The CD~lv88 pair is not very promising when considering SNPs at fdr<0.2, but the fit at fdr<0.5 is better, so I post the plots here too.

lymphocyte count(lymph)

red blood cell count(rbc)

The rbc~lv42 and rbc~lv59 pairs are not very promising when considering SNPs at fdr<0.2, but the fits at fdr<0.5 are better, so I post the plots here too.

asthma

white blood cell count

Effect size plots - checking the reverse causality

To check whether the effect size correlation is due to reverse causality, i.e. trait -> LV (the trait causally affects the LV) instead of LV -> trait (which is what we would like to see), I used all SNPs associated with the traits (pval<5E-8). The x-axis shows the effects of these SNPs on the trait, and the y-axis shows their effects on the LV.

Some pairs show p < 0.05, but this result may be driven by a possible causal effect of LV -> trait. To test this, I removed the SNPs that are associated with LVs at FDR < 0.2 and made the plots again.

BMI

Crohn’s disease(CD)

The CD~lv88 pair is not very promising when considering SNPs at fdr<0.2, but the fit at fdr<0.5 is better, so I post the plots here too.

lymphocyte count(lymph)

red blood cell count(rbc)

The rbc~lv42 and rbc~lv59 pairs are not very promising when considering SNPs at fdr<0.2, but the fits at fdr<0.5 are better, so I post the plots here too.

asthma

white blood cell count

Results - pval < 5e-8 & association test covariates: 10 PCs + GTEx: Sequencing platform, Sequencing protocol, Sex

Summary table

I used the ‘qvalue’ R package to compute the FDR from the p-values of each SNP and made a table showing the number of SNPs that pass each threshold. The thresholds are ‘fdr < 0.1’, ‘fdr < 0.2’ and ‘pval < 5e-8’.

Info table

For each trait, I made a table showing the info of SNPs with fdr<0.2 in the factor ~ SNP + genotype PCs association test. For each trait, the LVs that have more than one significant SNP at FDR<0.2 are included.

The suffix ‘_assoc’ means that the results are from the factor ~ SNP + genotype PCs association test. The suffix ‘_gwas’ means that the results are from the original GWAS result files. For EUR.CD, EUR.IBD and EUR.UC, effectsize_gwas is ‘ln(OR)’; for the other traits, it is ‘beta’.

‘snp_ld’ lists the SNPs in LD with the SNP on each line. ‘ld_r2’ is the LD r-squared corresponding to the ‘snp_ld’ column. The ‘cis-eqtl’ column indicates whether the SNP is a cis-eQTL according to GTEx data. ‘cis_gene_hgnc’ and ‘cis_gene_hgnc’ are the genes that the SNP influences when it acts as a cis-eQTL. ‘func’ and ‘func_gene’ are obtained from ANNOVAR and indicate the function of the SNP within the genes.

Effect size plots

For each trait, I plotted the association with the LV (indicated by beta) against the association with the trait (indicated by ln(odds ratio) or beta in the GWAS) to show whether the variants have correlated effect directions. The effect sizes from the Catalog GWAS and the factor association tests were harmonized with the TwoSampleMR R package so that the effect alleles in the two analyses are identical. The LVs that have more than one significant SNP at FDR<0.2 are included in the plots. In addition, for each plot, I fitted the points with a line with intercept = 0. The p-values and r-squared values are shown on the plots.
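The idea behind the harmonization step is that an effect size only has a meaningful sign relative to its effect allele. The sketch below is a deliberately simplified illustration and not TwoSampleMR's procedure: it only handles the case where the two studies report the same allele pair in the same or swapped order, and it ignores strand flips and palindromic SNPs. The function and argument names are hypothetical.

```python
def harmonize_beta(beta_lv, ea_lv, oa_lv, ea_trait, oa_trait):
    """Express the LV effect size relative to the trait GWAS effect allele.

    ea_* / oa_*: effect and other allele in each analysis.
    Returns the (possibly sign-flipped) beta, or None if the allele
    pairs cannot be matched by a simple swap.
    """
    if (ea_lv, oa_lv) == (ea_trait, oa_trait):
        return beta_lv            # same effect allele: keep as is
    if (ea_lv, oa_lv) == (oa_trait, ea_trait):
        return -beta_lv           # alleles swapped: flip the sign
    return None                   # mismatch: drop the SNP in this sketch


same = harmonize_beta(0.12, "A", "G", "A", "G")
flipped = harmonize_beta(0.12, "G", "A", "A", "G")
unmatched = harmonize_beta(0.12, "A", "C", "A", "G")
```

Without this step, a SNP whose effect allele differs between the two analyses would land in the wrong quadrant of the effect size plot.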

BMI

eosinophil count(eo)

Crohn’s disease(CD)

IBD

Ulcerative colitis(UC)

granulocyte count(gran)

None of the LVs have >1 SNPs at FDR<0.2.

lymphocyte count(lymph)

myeloid white cell count (myeloid wbc)

neutrophil count (neut)

plt

rbc

T2D

None of the LVs have >1 SNPs at FDR<0.2.

asthma

wbc

Effect size plots – more SNPs

For some promising trait-factor pairs, I relaxed the FDR threshold for the SNPs used to make the effect size plots (from 0.2 to 0.3/0.5).

BMI

The BMI~lv90 pair is not very promising when considering SNPs at fdr<0.2, but the fit at fdr<0.5 is better, so I post the plots here too.

lymphocyte count(lymph)

red blood cell count(rbc)

asthma

white blood cell count

Effect size plots - checking the reverse causality

To check whether the effect size correlation is due to reverse causality, i.e. trait -> LV (the trait causally affects the LV) instead of LV -> trait (which is what we would like to see), I used all SNPs associated with the traits (pval<5E-8). The x-axis shows the effects of these SNPs on the trait, and the y-axis shows their effects on the LV.

Some pairs show p < 0.05, but this result may be driven by a possible causal effect of LV -> trait. To test this, I removed the SNPs that are associated with LVs at FDR < 0.2 and made the plots again.

BMI

lymphocyte count(lymph)

red blood cell count(rbc)

asthma

white blood cell count


sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] workflowr_1.6.2

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5        rstudioapi_0.11   whisker_0.3-2     knitr_1.30       
 [5] magrittr_1.5      R6_2.4.1          rlang_0.4.8       highr_0.8        
 [9] stringr_1.4.0     tools_3.6.1       DT_0.15           xfun_0.18        
[13] git2r_0.26.1      crosstalk_1.1.0.1 htmltools_0.5.0   ellipsis_0.3.1   
[17] rprojroot_1.3-2   yaml_2.2.1        digest_0.6.25     tibble_3.0.3     
[21] lifecycle_0.2.0   crayon_1.3.4      later_1.1.0.1     htmlwidgets_1.5.2
[25] vctrs_0.3.4       promises_1.1.1    fs_1.5.0          glue_1.4.2       
[29] evaluate_0.14     rmarkdown_1.13    stringi_1.5.3     compiler_3.6.1   
[33] pillar_1.4.6      backports_1.1.10  jsonlite_1.7.1    httpuv_1.5.1     
[37] pkgconfig_2.0.3